Abstract.

This paper presents a detailed error analysis of geometric hashing in the domain of 2D object recognition. Earlier analysis has shown that these methods are likely to produce false positive hypotheses when one allows for uniform bounded sensor error and moderate amounts of extraneous clutter points. These false positives must be removed by a subsequent verification step. Later work has incorporated an explicit 2D Gaussian instead of a bounded error model to improve performance of the hashing method.

The contribution of this paper is to analytically derive the probability of false positives and negatives as a function of the number of model features, image features, and occlusion, under the assumption of 2D Gaussian noise and a particular method of evidence accumulation. A distinguishing feature of this work is that we make no assumptions about prior distributions on the model space, nor do we assume even the presence of the model. The results are presented in the form of ROC (receiver-operating characteristic) curves, from which several results can be extracted. First, they demonstrate that the 2D Gaussian error model performs better for high clutter levels and degrades more gracefully than the uniform bounded error model under the same conditions. They also directly indicate the optimal performance that can be achieved for a given clutter and occlusion rate, and how to choose the thresholds to achieve the desired rates.

Lastly,  we  verify  these  ROC  curves  in  the  domain  of  simulated  images. 
1 Introduction


Geometric hashing is a technique introduced in [LSW87], [HW88] to solve the problem of recognizing objects and their associated poses in cluttered scenes. The main idea behind the technique is that instead of checking every possible correspondence of image to model features to establish a model pose and then checking the image for supporting evidence, the recognition process is considerably sped up by splitting it into two stages. In the first stage, a database of all possible views of the model is precomputed and stored in a hash table. In the second stage, recognition consists of using 2D image features to index into the hash table in order to vote for possible model poses.

However,  under  the  assumption  of  uniform  bounded 
sensor  error,  performance  degrades  rapidly  with  even  a 
moderate  amount  of  clutter  [GHJ91].  Intuitively,  the 
reason  is  that  the  error  causes  the  point  entries  in  the 
hash  table  to  blur  into  regions,  making  the  table  denser 
and  increasing  the  chances  that  a  random  image  point 
(i.e.,  a  point  not  arising  from  the  model)  will  corroborate 
an  incorrect  hypothesis. 

In this paper we analyze the effect of a more realistic noise model on these techniques. The question we address in the paper is, what kind of performance can we expect from the techniques as a function of the number of model features and clutter features (i.e., signal to noise ratio)?

To answer the question, first we briefly present the original hashing algorithms, then we show how to modify them in the presence of sensor error. We model the error as a 2D Gaussian distributed vector, which is often a more realistic model than the uniform bounded error model used in the earlier analysis [GHJ91]. A voting function for accumulating evidence for hypotheses based on the error model is presented. (Similar approaches to extending geometric hashing have been explored in [CHS90], [RH91].) This is the background for the main question, which is, how does one determine a reliable point at which to separate correct from incorrect hypotheses? This question is relevant in the noiseless case as well: assume there is a 25% occlusion rate, and we are searching for a model of size 20. Do we decide that a hypothesis is true after seeing 15 corroborating features, or 12, or 10? Clearly, the lower the acceptance threshold, the higher the probability of false positives, and the higher the acceptance threshold, the higher the probability that we will miss a correct hypothesis, i.e., of false negatives.

To find the optimal acceptance threshold for a fixed occlusion rate and a fixed number of model and clutter points, we use the given error model and voting scheme to derive expressions for the probability density functions of weights of positive and negative hypotheses. We then vary the acceptance threshold and find the probability of false positives and true positives for that threshold. The results are plotted as ROC curves, which indicate the optimal performance that can be achieved for the given level of occlusion, clutter, and number of model points.


2  Statement  of  the  Geometric  Hashing 
Algorithm 

We begin by reviewing the original geometric hashing algorithm assuming exact measurements [LSW87], [HW88]. The algorithm consists of two stages, a model preprocessing stage and a recognition stage. For simplicity, we restrict attention to planar objects in arbitrary 3D pose. The model representation consists of a set of $(x, y)$ points in what we will call model space, which is simply some fixed coordinate system. The points can be corners, points of high curvature, or points of inflection of the 2D model.

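Assuming orthographic projection, we can represent the image location $[u_i, v_i, 1]^T$ of each model point $[x_i, y_i, 1]^T$ with a simple linear transformation,

$$\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}$$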
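where the upper left of the transformation matrix is a 2x2 non-singular matrix, and $[t_x, t_y]^T$ is the translation vector. This is because the projection onto the $z = 0$ plane of a rotated, scaled, and translated point $(x, y, 0, 1)$ simplifies to the following, with $r_{ij}$ denoting the entries of the upper-left 2x2 block of the rotation matrix:

$$\begin{bmatrix} u \\ v \end{bmatrix} = s \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$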
where $s$ is a positive scale factor. It is a well known fact that if a point has coordinates X with respect to a given basis, then a linear transformation on the entire space leaves the coordinates of the point unchanged with respect to the transformed coordinates of the basis. The coordinates of X with respect to the basis are called affine coordinates, and it is their invariance under linear operations which is utilized in geometric hashing.

In the preprocessing stage, the hash table is constructed as follows: Every ordered triple of model points is used as a basis, and the affine coordinates $(\alpha, \beta)$ of all other model points are computed with respect to each basis. Thus, if $m_0$, $m_1$ and $m_2$ are basis points, then we represent any other feature point by

$$m_i = m_0 + \alpha_i (m_1 - m_0) + \beta_i (m_2 - m_0).$$

The basis (i.e., the 3 model points) is entered into the hash table at each $(\alpha_i, \beta_i)$ location. Intuitively, the invariance of the affine coordinates of model points with respect to 3 of its own points as basis is being used to “precompute” all possible views of the model in an image. The actual algorithm is:

•  for every ordered model triplet $B_k = (m_0, m_1, m_2)$,
   -  for every other model point $m_j$

      (i) find coordinates $m_j = (\alpha_j, \beta_j)$ with respect to basis $B_k$

      (ii) enter basis $B_k$ at location $(\alpha_j, \beta_j)$ in the hash table.

The running time for this stage is $O(m^4)$, where $m$ is the number of model points.
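The preprocessing loop above can be sketched in a few lines. This is only an illustration under stated assumptions: measurements are exact, no three model points are collinear, and the names `affine_coords`, `bin_key`, `build_hash_table`, the rounding-based key, and the five-point model are hypothetical choices that do not come from the original description.

```python
from collections import defaultdict
from itertools import permutations

import numpy as np


def affine_coords(p, basis):
    """Return (alpha, beta) such that p = b0 + alpha*(b1 - b0) + beta*(b2 - b0)."""
    b0, b1, b2 = (np.asarray(b, dtype=float) for b in basis)
    axes = np.column_stack((b1 - b0, b2 - b0))  # columns are the two basis axes
    alpha, beta = np.linalg.solve(axes, np.asarray(p, dtype=float) - b0)
    return float(alpha), float(beta)


def bin_key(alpha, beta, ndigits=6):
    """Hash-table key; the rounding only absorbs floating-point noise (assumption)."""
    return (round(alpha, ndigits), round(beta, ndigits))


def build_hash_table(model):
    """Enter every ordered basis triple at the location of every other model point."""
    table = defaultdict(list)
    for basis in permutations(range(len(model)), 3):  # ordered triples of indices
        basis_pts = [model[i] for i in basis]
        for j, p in enumerate(model):
            if j not in basis:
                table[bin_key(*affine_coords(p, basis_pts))].append(basis)
    return table


# Hypothetical 5-point model with no three points collinear.
model = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 4.0), (2.0, 6.0)]
table = build_hash_table(model)
print(sum(len(bases) for bases in table.values()))  # one entry per (basis, other point) pair
```

With 5 model points there are 60 ordered bases and 2 remaining points per basis, so the table holds 120 entries, which illustrates the $O(m^4)$ growth.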

At recognition time, the image is processed to extract 2D feature points which are used to index into the table. The choice of features used must be determined by what points were used as model feature points, i.e., if corners were used as model features, then one might take the intersection of all line segments to be the image feature points. Every image triple is then taken to be a basis, and the affine coordinates of all other image points are computed with respect to the basis to index into the hash table and “vote” for all bases found there. Intuitively we are searching for any three image points which come from the model, and using the hash table to verify hypothesized triples of image points as instances of model points. Such an image triple will yield a large number of votes for its corresponding model basis. In particular:

•  for every unordered image triplet $(i_0, i_1, i_2)$

   (a) for every other image point $i_j$

      (i) find coordinates $i_j = (\alpha_j, \beta_j)$ with respect to basis $(i_0, i_1, i_2)$

      (ii) index into the hash table at location $(\alpha_j, \beta_j)$ and increment a histogram count for all bases found there.

   (b) If the weight of the vote for any basis $B_k$ is sufficiently high, stop and output the correspondence between triple $(i_0, i_1, i_2)$ and basis $B_k$ as a correct hypothesis.
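The recognition loop above can be illustrated with a small self-contained sketch for the noiseless case. Everything named here (`affine_coords`, `key`, `build_table`, `recognize`, the six-point model, the affine map `T`, `t`, and the clutter points) is a hypothetical choice made for illustration; the acceptance threshold is simply set to the number of non-basis model points.

```python
from collections import Counter, defaultdict
from itertools import combinations, permutations

import numpy as np


def affine_coords(p, b0, b1, b2):
    """Solve p = b0 + alpha*(b1 - b0) + beta*(b2 - b0) for (alpha, beta)."""
    return np.linalg.solve(np.column_stack((b1 - b0, b2 - b0)), p - b0)


def key(ab, ndigits=6):
    """Hash key; rounding only absorbs floating-point noise (assumption)."""
    return tuple(round(float(x), ndigits) for x in ab)


def build_table(model):
    """Preprocessing: store every ordered model basis at each other point's coordinates."""
    table = defaultdict(list)
    for basis in permutations(range(len(model)), 3):
        b = model[list(basis)]
        for j in set(range(len(model))) - set(basis):
            table[key(affine_coords(model[j], *b))].append(basis)
    return table


def recognize(image, table, threshold):
    """Return (image triple, model basis, votes) for the first basis reaching threshold."""
    for triple in combinations(range(len(image)), 3):
        b = image[list(triple)]
        # Skip (nearly) collinear image triples, which do not form a stable basis.
        if abs(np.linalg.det(np.column_stack((b[1] - b[0], b[2] - b[0])))) < 1e-9:
            continue
        votes = Counter()
        for j in range(len(image)):
            if j not in triple:
                votes.update(table.get(key(affine_coords(image[j], *b)), ()))
        if votes:
            basis, count = votes.most_common(1)[0]
            if count >= threshold:
                return triple, basis, count
    return None


# Hypothetical model (no three points collinear) seen under an arbitrary affine map.
model = np.array([(0, 0), (4, 0), (0, 3), (5, 4), (2, 6), (7, 1)], dtype=float)
T, t = np.array([[1.2, -0.5], [0.3, 0.9]]), np.array([10.0, -4.0])
clutter = np.random.default_rng(0).uniform(-20.0, 40.0, size=(3, 2))
image = np.vstack([model @ T.T + t, clutter])

result = recognize(image, build_table(model), threshold=len(model) - 3)
print(result)
```

Because the first three image points are the projections of model points 0, 1 and 2, the first image triple already collects one vote from each of the three remaining model points and the search stops there.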

In some versions of the algorithm, the hypothesis that is output subsequently undergoes a verification stage before being accepted as correct. Note that we need to order the points either at the preprocessing stage or at recognition time, but not both (or there would be a sixfold redundancy of correspondences). We choose to order the points at the preprocessing stage and enter every model point with respect to a single unordered basis set 6 times, once for every ordering of the basis set. This makes the table 6 times denser, but then at recognition time we need only to choose an unordered image triple and impose a single arbitrary ordering upon it. This way, when we use the remaining image points to index into the hash table, we vote for the ordering of the model basis set as well as the model basis set itself. The termination condition for accepting a correspondence of bases (and hence a pose of the object) and the confidence of the result are exactly the issues we investigate in this paper.

3  Modifications  to  the  Algorithms  in 
the  Presence  of  Error 

We now assume sensor uncertainty, namely, that a model feature appears at its projected location, but displaced by an error vector drawn from some distribution. Without noise, a correct matching (i.e., a correct pairing of 3 model basis points and 3 image basis points) yields a single $(x, y)$ location for a projected fourth model point in the image and a single $(\alpha, \beta)$ location for the same point in the hash table. Under the assumption of circular uniform bounded error, [GHJ91] showed that a matching gives rise to a circular disk of possible image locations for any projected fourth model point, and that this circular disk in the image translates to an ellipsoidal range of affine coordinates in the hash table. Therefore, in practice, the bases should be stored (weighted by some function of the error distribution) at all possible affine locations for the fourth point. However, it is simpler to analyze the probability that a uniformly distributed random point will fall into a given circle, than to translate the uniform distribution into a distribution on affine coordinates, and to analyze the probability that the random point with affine coordinates drawn from this distribution will fall into a given ellipse. It is clear that the answer is the same, but that the first space is more manageable than the second. We will therefore choose to do the analysis using the simpler space, keeping in mind that the results found in this fashion are true of the analysis done in hash table space as well. One consequence of this is that the analysis will apply equally well to alignment and to geometric hashing.

In  the  modified  algorithm,  instead  of  incrementing  a 
histogram  count  for  every  eligible  basis  by  a  full  vote, 
we  increment  the  basis  count  by  a  number  between  0 
and  1  according  to  some  “goodness”  criterion,  which  in 
our  case  is  a  function  of  the  distance  of  the  point  from 
its  expected  location.  Because  of  this,  we  must  look 
at  the  density  function  of  the  accumulated  values  for 
correct  and  incorrect  hypotheses,  instead  of  the  discrete 
probability  of  a  particular  vote.  We  will  use  the  term 
“weight  of  a  hypothesis”  to  denote  this  concept. 
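As a rough illustration of fractional voting, the sketch below accumulates the weight of a hypothesis from per-point distances. It assumes the goodness criterion is the value of a 2D circular Gaussian density at distance `d` with a cutoff at two standard deviations, which is the choice the analysis adopts later; the function names and the sample numbers are hypothetical.

```python
import math


def vote_weight(d, sigma_e, cutoff=2.0):
    """Weight contributed by an image point at distance d from its predicted location."""
    if d > cutoff * sigma_e:
        return 0.0  # outside the search radius: no vote at all
    return math.exp(-d * d / (2.0 * sigma_e**2)) / (2.0 * math.pi * sigma_e**2)


def hypothesis_weight(distances, sigmas):
    """Sum of fractional votes over all points matched to one hypothesized basis."""
    return sum(vote_weight(d, s) for d, s in zip(distances, sigmas))


# Three matched points; the third lies beyond 2 * sigma_e and contributes nothing.
weight = hypothesis_weight([0.0, 1.5, 7.0], [3.0, 3.0, 3.0])
print(weight)
```

The accumulated value is a real number rather than an integer count, which is why the analysis works with density functions of hypothesis weights.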

4  Overview  of  the  Analysis 

The main claim of the paper is supported by the argument whose steps are as follows:

(a) A 2D circular Gaussian distribution is often a more accurate model for sensor error than a model assuming a bounded uniform distribution [Wel91]. While a bounded model leads to conservative estimates on performance, a Gaussian model may lead to more practical estimates.

(b) Using this Gaussian distribution, the following is true: given a correspondence between 3 image points and 3 model points (referred to as a hypothesis for the rest of the article), and assuming a fixed standard deviation $\sigma_0$ for the sensed error of the image points, the location of a fourth model point with affine coordinates $(\alpha, \beta)$ (with respect to the 3 image basis points) will also have a 2D circular normal distribution with standard deviation $\sigma_e$:

$$\sigma_e = \sigma_0 \left((1 - \alpha - \beta)^2 + \alpha^2 + \beta^2 + 1\right)^{1/2}$$

Note that the possible distance of a fourth model point from its predicted location is now unbounded. In our scheme we will pick a cutoff search distance of $2\sigma_e$ for possible matching image features, which will imply a probability of false negative identification of 13.5% for a single point.
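The two quantities in step (b) are easy to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, and the miss probability uses the standard fact that the radial distance of a 2D circular Gaussian exceeds $k\sigma$ with probability $e^{-k^2/2}$.

```python
import math


def sigma_e(alpha, beta, sigma0):
    """Standard deviation of the projected fourth point for affine coordinates (alpha, beta)."""
    return sigma0 * math.sqrt((1.0 - alpha - beta) ** 2 + alpha**2 + beta**2 + 1.0)


def miss_probability(k=2.0):
    """P(r > k * sigma) for a 2D circular Gaussian, i.e. the single-point false negative rate."""
    return math.exp(-k * k / 2.0)


print(sigma_e(0.5, 0.5, 2.5))   # a point midway between the second and third basis points
print(miss_probability(2.0))    # about 0.135, matching the 13.5% quoted above
```

The factor multiplying `sigma0` is smallest at `alpha = beta = 1/3`, so points far from the basis triangle have the largest positional uncertainty.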

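(c) As in [GHJ91], we find the density of $\sigma_e$, in one case when the values of $\sigma_e$ come from a model appearing in the image ($f_H(\sigma_e)$), and in the other case, on $\sigma_e$ resulting from incorrect hypotheses ($f_{\bar H}(\sigma_e)$). The two different density functions are

$$f_H(\sigma_e) = (b_1 \sigma_e)^{-2}, \qquad f_{\bar H}(\sigma_e) = (b_2 \sigma_e^2)^{-1}$$

where $b_1 = 0.58$, $b_2 = 0.35$.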

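(d) Next, we modify the recognition algorithm so that it assigns weights to points found within the error disk, as opposed to a single 1/0 vote. We choose to use

$$g(d) = \frac{1}{2\pi\sigma_e^2}\, e^{-d^2/(2\sigma_e^2)}$$

where $d$ = distance from the point's hypothesized to actual location. This is the value of the 2D Gaussian density function whose center is at the hypothesized location.

(e) Define random variables $V_H$ = the weight that a model point's projection contributes to its supporting basis, and $V_{\bar H}$ = the weight that a random image point contributes to a given basis. To demonstrate what this means, in the simpler bounded uniform error case, the distribution of $V_H$ is:

$$f(V_H = v) = \begin{cases} 1 - c & v = 1 \\ c & v = 0 \\ 0 & \text{otherwise} \end{cases}$$

i.e., the probability that a fourth model point will contribute a weight of 1 to a correct hypothesis is $1 - c$, where $c$ is the probability of occlusion. A more complicated expression holds for $V_{\bar H}$ [GHJ91].

In the Gaussian error scheme with a cutoff distance of $2\sigma_e$ these distributions are:

$$f(V_H = v) = \begin{cases} c + e^{-2}(1 - c) & v = 0 \\ (1 - c)\,\frac{2\pi}{b_1^2}\left(s_2 - \frac{1}{e\sqrt{2\pi v}}\right) & \ell_1 < v \le \ell_2 \\ (1 - c)\,\frac{2\pi}{b_1^2}\,\frac{e - 1}{e\sqrt{2\pi v}} & \ell_2 < v \le \ell_3 \\ (1 - c)\,\frac{2\pi}{b_1^2}\left(\frac{1}{\sqrt{2\pi v}} - s_1\right) & \ell_3 < v \le \ell_4 \\ 0 & \text{otherwise} \end{cases}$$

$$f(V_{\bar H} = v) = \begin{cases} 1 - \frac{4\pi}{b_2 R^2}(s_2 - s_1) & v = 0 \\ \frac{2\pi}{b_2 R^2 v}\left(s_2 - \frac{1}{e\sqrt{2\pi v}}\right) & \ell_1 < v \le \ell_2 \\ \frac{2\pi}{b_2 R^2 v}\,\frac{e - 1}{e\sqrt{2\pi v}} & \ell_2 < v \le \ell_3 \\ \frac{2\pi}{b_2 R^2 v}\left(\frac{1}{\sqrt{2\pi v}} - s_1\right) & \ell_3 < v \le \ell_4 \\ 0 & \text{otherwise} \end{cases}$$

where

$$\ell_1 = \frac{1}{2\pi s_2^2 e^2}, \qquad \ell_2 = \frac{1}{2\pi s_2^2}, \qquad \ell_3 = \frac{1}{2\pi s_1^2 e^2}, \qquad \ell_4 = \frac{1}{2\pi s_1^2},$$

$R^2$ is the size of the image, and $s_1$, $s_2$ are the minimum and maximum allowable values for $\sigma_e$, respectively.

(f) The probability density function for the weight of an incorrect hypothesis is calculated as follows: For a single random point in an image with $m$ projected model points, the distribution is:

Dropping $n$ points convolves this distribution with itself $n - 3$ times: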

For a model of size $m$ and a correct hypothesis in an image with $n$ points, the weight of the total vote for this hypothesis is the sum of weights over all $m - 3$ other projected model points plus the sum of the weights of the $n - m$ clutter points. We will call this random variable $W_H = \sum_{i=1}^{m-3} V_{H_i} + \sum_{i=1}^{n-m} V_{\bar H_i}$. Though the random variables $V_{H_i}$ are not independent, we make the simplifying assumption that they are, and proceed with the analysis. Assuming independence, the sum follows the distribution:

$$f(W_H = v) = \bigotimes_{i=1}^{m-3} f(V_{H_i}) \otimes \bigotimes_{i=1}^{n-m} f(V_{\bar H_i})$$

where $\otimes$ denotes convolution. The validity of this assumption will be examined in a later section of this paper. We will use the central limit theorem to avoid actually having to compute this distribution, and will assume that the result of the convolution is Gaussian.

(g) Given these two distributions, we can now find the probability that an incorrect hypothesis will look like a correct one. The problem of deciding whether a sensor basis corresponds to a particular model basis is a simple binary hypothesis testing problem, for which we can easily find an optimum decision rule. We postpone the discussion of this rule until a later section; for now we will simply state that the decision rule yields a fixed probability of false positive ($P_F$) versus detection ($P_D$) as a function of threshold. It is also shown that this decision rule performs better for high clutter levels and degrades more gracefully than the analogous optimal decision rule in the uniform bounded error case.

(h) Now let us step back and look at the overall decision problem. We pick three image points, and accumulate weights for $N = 6\binom{m}{3}$ bases. Suppose we are willing to verify (by alignment or any other verification technique) all bases that pass the initial test, as long as there are $\le k$ of them. Then, an overall false positive is the combined event that the three image points being tested do not arise from the model, yet more than $k$ model bases “look good”. An overall true positive is the combined event that the three image points do arise from the model, that $\le k$ model bases pass the test, and of these, one of them is the correct one. We will call these combined events $\Omega_F$ and $\Omega_D$, and

$$P(\Omega_F) = 1 - \sum_{i=0}^{k} \binom{N}{i} P_F^{\,i} (1 - P_F)^{N - i}$$

$$P(\Omega_D) = P_D \sum_{i=0}^{k-1} \binom{N}{i} P_F^{\,i} (1 - P_F)^{N - i}$$

The following sections show the derivation of these distributions, and the results of the analysis both analytically and empirically.

5 Deriving the Projected Gaussian Distribution

In [GHJ91] an analytic expression for the case of circular error disks was derived as follows. Given 3 model points $m_1$, $m_2$, $m_3$ (with model space coordinates) as basis, and the affine coordinates $(\alpha, \beta)$ of a fourth model point with respect to this basis, the expression for the coordinates of the fourth point in model space is

$$m_4 = m_1 + \alpha(m_2 - m_1) + \beta(m_3 - m_1).$$

Under an arbitrary affine transformation $T$, each model point projects to the location

$$s_i = T m_i + e_i$$

where $e_i$ is a vector drawn from the error distribution. The possible location of the fourth model point is found by plugging the first expression into the second equation, to yield

$$s_4 = T m_4 + e_4$$

where

$$e_4 = (1 - \alpha - \beta)\, e_1 + \alpha e_2 + \beta e_3 + e_4'. \qquad (1)$$

When the error vector is drawn from a uniform circular distribution with radius $\epsilon$, the expression for the projected error vector is found to be

$$\epsilon\left[\,|1 - \alpha - \beta| + |\alpha| + |\beta| + 1\,\right] \qquad (2)$$

For this paper, the sensor error vector is drawn from a two dimensional circular Gaussian distribution. The 2D Gaussian probability density of a random variable $a$ with 0 covariance is denoted as:

$$f(a = (x, y)) = \frac{1}{2\pi\sigma_x\sigma_y}\, e^{-\frac{x^2}{2\sigma_x^2} - \frac{y^2}{2\sigma_y^2}} = f(a_x = x)\, f(a_y = y)$$

Because the two components are independent, the probability density of the sum of two random variables with 2D Gaussian distribution and 0 covariance is:

$$f(a + b = (x, y)) = f(a_x + b_x = x,\; a_y + b_y = y)$$

Convolution in each dimension yields:

$$f(a + b = (x, y)) = \frac{1}{\sqrt{2\pi(\sigma_{a_x}^2 + \sigma_{b_x}^2)}}\, e^{-\frac{x^2}{2(\sigma_{a_x}^2 + \sigma_{b_x}^2)}} \cdot \frac{1}{\sqrt{2\pi(\sigma_{a_y}^2 + \sigma_{b_y}^2)}}\, e^{-\frac{y^2}{2(\sigma_{a_y}^2 + \sigma_{b_y}^2)}}$$

Multiplying by a scalar yields

$$f(c\,a = (x, y)) = f(c\,a_x = x)\, f(c\,a_y = y) = \frac{1}{\sqrt{2\pi}\,(c\sigma_x)}\, e^{-\frac{x^2}{2(c\sigma_x)^2}} \cdot \frac{1}{\sqrt{2\pi}\,(c\sigma_y)}\, e^{-\frac{y^2}{2(c\sigma_y)^2}}$$

Therefore, assuming $e_i$ to be 2D Gaussian with 0 covariance and standard deviations $\sigma_x = \sigma_y = \sigma$, the distribution of the vector in equation (1) is a 2D Gaussian with covariance 0 and standard deviation:

$$\sigma_e = \sigma\sqrt{(1 - \alpha - \beta)^2 + \alpha^2 + \beta^2 + 1}$$

in both the x and y direction. Because the Gaussian distribution is not bounded, we choose to terminate the search for points after a radius of $2\sigma_e$, which means that we will find an image feature arising from a model point 86.5% of the time (this is demonstrated in a later section). Note that this expression is always smaller than its analogous expression for disk radius in the uniform bounded error model from equation (2) because of the triangle inequality. In the comparison, $\epsilon = 2\sigma$.


6 Determining the Distribution for $\sigma_e$

In the analysis we use two different probability densities for $\sigma_e$, one for correct basis matchings and one for incorrect basis matchings. Intuitively, this is due to the fact that when an incorrect basis matching is tested, more often than not the projected model points fall outside the image range and are thrown away, while when a correct hypothesis is tested the remaining model points always project to within the image. In tests we have observed that over half of the incorrect hypotheses are rejected for this reason, leading to an altered density for $\sigma_e$.

Let us call the two distributions $f_H(\sigma_e)$ and $f_{\bar H}(\sigma_e)$.

We empirically estimate the former distribution by generating a random model of size 25, then for each ordered triple of model points as basis, we increment a histogram for the value of $\sigma_e$ as a function of $\alpha$ and $\beta$ for all the other model points with respect to that basis. For the latter distribution, we generate a random model of size 4 and a random image, and histogram the values of $\sigma_e$ for only those cases in which the initial basis matching causes the remaining model point to fall within the image. The distributions for $\sigma_e$ found in this manner have been observed to be invariant over many different values of model and image points.

The  model  is  constrained  such  that  the  maximum  dis¬ 
tance  between  any  two  model  points  is  not  greater  than 
10  times  the  minimum  distance,  and  in  the  basis  selec¬ 
tion,  no  basis  is  chosen  such  that  the  angle  ip  between 
the  two  axes  is  0  <|  0  |<  ^  or  y|x  <  ip  <  This  is 
done  to  avoid  unstable  bases. 

The results were almost identical in every test we ran; two typical normalized histograms are shown in Figure 1. For a choice of $\sigma = 2.5$, the histograms very closely fit the curves $f_H(\sigma_e) = (b_1\sigma_e)^{-2}$, $b_1 = 0.58$, and $f_{\bar H}(\sigma_e) = (b_2\sigma_e^2)^{-1}$, $b_2 = 0.35$, between the ranges $s_1 = 2.875$ and $s_2 = 120$. Figure 1 shows the estimated density functions superimposed on the empirical distributions. The integrals of the analytic expressions thus defined are 1.009 and 0.975, respectively.
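The normalization of the fitted curve for correct hypotheses can be checked numerically. The sketch below assumes the fit has the form $(b_1\sigma_e)^{-2}$ on $[s_1, s_2]$ with the constants quoted above; the function names are hypothetical, and the midpoint rule is used only as an independent cross-check of the closed form.

```python
def mass_closed_form(b, s1, s2):
    """Integral of (b * sigma)**-2 for sigma in [s1, s2]."""
    return (1.0 / s1 - 1.0 / s2) / b**2


def mass_midpoint(b, s1, s2, steps=200_000):
    """Midpoint-rule approximation of the same integral."""
    h = (s2 - s1) / steps
    return sum((b * (s1 + (i + 0.5) * h)) ** -2 for i in range(steps)) * h


B1, S1, S2 = 0.58, 2.875, 120.0
mass_h = mass_closed_form(B1, S1, S2)
print(mass_h, mass_midpoint(B1, S1, S2))
```

Under these assumptions the closed form evaluates to roughly 1.009, in line with the value stated for the first density, so the fitted curve is very nearly a proper density on the chosen range.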

7  Derivation  of  the  Single  Point 
Distributions 

In this section we show the derivation of the distributions $f(V_H = v)$, the density function on the values that an image point contributes to a model basis given that the point comes from the model, and $f(V_{\bar H} = v)$, the density function on the values that an image point contributes to a basis given that it is a random point. We begin with the former.


7.1 Deriving $f(V_H)$

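Given a correct hypothesis and no occlusion, the location of a projected model point can be modeled as a vector $d$ centered at the predicted location with Gaussian distribution (expressed in polar coordinates)

$$f(r, \theta) = \frac{r}{2\pi\sigma_e^2}\, e^{-r^2/(2\sigma_e^2)}$$

where we know $\sigma_e$ and its distribution. We now choose an evaluation function $g(d)$, which we use to weight a match that is offset by $d$ from the predicted match location. We want to find its density, i.e., we want $f(v = g(d))$, where the distribution of $d$ is as stated. As mentioned, we choose the evaluation function

$$g(d = (r, \theta)) = \frac{1}{2\pi\sigma_e^2}\, e^{-r^2/(2\sigma_e^2)}$$

Since the evaluation function $g$ is really a function of $r$ alone, we need to know the density function of $r$. To find this, we integrate $f(r, \theta)$ over $\theta$:

$$f(r) = \int_0^{2\pi} f(r, \theta)\, d\theta = \frac{r}{\sigma_e^2}\, e^{-r^2/(2\sigma_e^2)}$$

Next, we want to find the density of the weight function $v = g(r)$. The change of variables formula for a monotonically decreasing function is:

$$\mathrm{dens}(g(r) = v) = \frac{-f(g^{-1}(v))}{g'(g^{-1}(v))}$$

Working through the steps, we find

$$\begin{aligned}
g(r) &= \frac{1}{2\pi\sigma_e^2}\, e^{-r^2/(2\sigma_e^2)} \\
g'(r) &= -\frac{r}{\sigma_e^2}\, g(r) \\
f(r) &= \frac{r}{\sigma_e^2}\, e^{-r^2/(2\sigma_e^2)} = 2\pi r\, g(r) \\
f(g^{-1}(v)) &= 2\pi\, g^{-1}(v)\, g(g^{-1}(v)) = 2\pi v\, g^{-1}(v) \\
\Rightarrow\; f(v = g(r)) &= \frac{-f(g^{-1}(v))}{g'(g^{-1}(v))} = \frac{2\pi v\, g^{-1}(v)}{\frac{g^{-1}(v)}{\sigma_e^2}\, v} = 2\pi\sigma_e^2
\end{aligned}$$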
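The constant density obtained above can be checked with a quick Monte Carlo experiment. This is a sketch under stated assumptions: the value `sigma_e = 3.0`, the sample size, the seed, and the number of histogram bins are arbitrary choices, and the weight is taken to be the Gaussian density value at the sampled offset.

```python
import numpy as np

sigma_e = 3.0
rng = np.random.default_rng(0)

# Offsets of a projected model point from its predicted location (2D circular Gaussian).
d = rng.normal(0.0, sigma_e, size=(200_000, 2))
r = np.hypot(d[:, 0], d[:, 1])

# Weight assigned to each offset: the Gaussian density value at distance r.
peak = 1.0 / (2.0 * np.pi * sigma_e**2)
v = peak * np.exp(-(r**2) / (2.0 * sigma_e**2))

# Empirical density of v over its full range [0, peak], compared with 2*pi*sigma_e**2.
hist, _ = np.histogram(v, bins=20, range=(0.0, peak), density=True)
expected = 2.0 * np.pi * sigma_e**2
max_rel_dev = float(np.max(np.abs(hist - expected) / expected))

# Fraction of points beyond the 2*sigma_e cutoff, to compare with exp(-2).
miss_rate = float(np.mean(r > 2.0 * sigma_e))
print(max_rel_dev, miss_rate)
```

The histogram is flat to within sampling noise, and the fraction of offsets beyond the cutoff is close to $e^{-2}$, the single-point miss rate used throughout the analysis.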


It may seem counterintuitive that the resulting distribution is constant. However, this can be understood if one considers an example in which $f(r, \theta)$ is uniformly distributed. Integrating over all angles yields a linearly increasing function in $r$. Assigning an evaluation function $g(d)$ which is inversely proportional to $r$ yields a constant density function on $f(v)$. The same thing is happening here, only quadratically. Since we only search for a match out to a radius of $2\sigma_e$, the effective distribution is:

$$f(V_H = v \mid \sigma_e) = \begin{cases} 2\pi\sigma_e^2 & \frac{1}{2\pi\sigma_e^2 e^2} \le v \le \frac{1}{2\pi\sigma_e^2} \\ \int_{u = g(\infty)}^{g(2\sigma_e)} 2\pi\sigma_e^2\, du = e^{-2} & v = 0 \\ 0 & \text{otherwise} \end{cases}$$

i.e., we will miss a good point $e^{-2} = 13.5\%$ of the time. This expression correctly integrates to 1. Now, note that in the expression we have a fixed $\sigma_e$, i.e., we actually have derived $f(v = g(r) \mid \sigma_e)$. We need to integrate this expression over all values of $\sigma_e$, that is,

$$f(V_H = v) = \int f(V_H = v \mid \Sigma = \sigma)\, f_H(\Sigma = \sigma)\, d\sigma$$

There are two things to take into consideration when calculating the limits for this expression: first, the possible values of $\sigma_e$ range from a lower limit $s_1$ to an upper limit $s_2$, due to limits on the values of the affine coordinates. (Earlier, we saw for $\sigma = 2.5$ that $s_1 = 2.875$, $s_2 = 120$.) Also, for a given $\sigma_e$, it is clear that the maximum value we can achieve is when $r = 0 \Rightarrow v = \frac{1}{2\pi\sigma_e^2}$, and the minimum value we can achieve is at the cutoff point $r = 2\sigma_e \Rightarrow v = \frac{1}{2\pi\sigma_e^2 e^2}$. Setting $v$ to each of these expressions and solving for $\sigma_e$ leads to the conclusion that for a particular value $v$, the only values for $\sigma_e$ such that $g(d \mid \sigma_e)$ could equal $v$ are in the range $\left(\frac{1}{e\sqrt{2\pi v}}, \frac{1}{\sqrt{2\pi v}}\right)$. Therefore the lower bound on the integral is $\sigma_e = \max\left(s_1, \frac{1}{e\sqrt{2\pi v}}\right)$, and the upper bound is $\sigma_e = \min\left(\frac{1}{\sqrt{2\pi v}}, s_2\right)$. We split this integral into 3 regions, and deal with the case where $v = 0$ separately. Integrating, we get:


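$$f(V_H = v) = \begin{cases} e^{-2} & v = 0 \\ \frac{2\pi}{b_1^2}\left(s_2 - \frac{1}{e\sqrt{2\pi v}}\right) & \ell_1 < v \le \ell_2 \\ \frac{2\pi}{b_1^2}\,\frac{e - 1}{e\sqrt{2\pi v}} & \ell_2 < v \le \ell_3 \\ \frac{2\pi}{b_1^2}\left(\frac{1}{\sqrt{2\pi v}} - s_1\right) & \ell_3 < v \le \ell_4 \\ 0 & \text{otherwise} \end{cases}$$

where

$$\ell_1 = \frac{1}{2\pi s_2^2 e^2}, \qquad \ell_2 = \frac{1}{2\pi s_2^2}, \qquad \ell_3 = \frac{1}{2\pi s_1^2 e^2}, \qquad \ell_4 = \frac{1}{2\pi s_1^2}$$

Figure 1: The distributions $f_H(\sigma_e)$ and $f_{\bar H}(\sigma_e)$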
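As a consistency check on the piecewise density for a correct match, the sketch below integrates it numerically and compares the total continuous mass and the first moment with the values obtained by integrating over $\sigma_e$ first. It assumes $f_H(\sigma_e) = (b_1\sigma_e)^{-2}$ with $b_1 = 0.58$, $s_1 = 2.875$, $s_2 = 120$ and no occlusion; the names `f_VH`, `trapezoid`, and the grid size are hypothetical.

```python
import numpy as np

B1, S1, S2 = 0.58, 2.875, 120.0
E2 = np.e**2
L1, L2 = 1.0 / (2.0 * np.pi * S2**2 * E2), 1.0 / (2.0 * np.pi * S2**2)
L3, L4 = 1.0 / (2.0 * np.pi * S1**2 * E2), 1.0 / (2.0 * np.pi * S1**2)


def f_VH(v):
    """Continuous part (v > 0) of the weight density for a correct match."""
    v = np.asarray(v, dtype=float)
    root = 1.0 / np.sqrt(2.0 * np.pi * v)  # largest sigma_e that can produce weight v
    scale = 2.0 * np.pi / B1**2
    return np.select(
        [(L1 < v) & (v <= L2), (L2 < v) & (v <= L3), (L3 < v) & (v <= L4)],
        [scale * (S2 - root / np.e), scale * (np.e - 1.0) / np.e * root, scale * (root - S1)],
        default=0.0,
    )


def trapezoid(y, x):
    """Plain trapezoidal rule on a non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


v = np.geomspace(L1, L4, 200_001)  # log-spaced grid resolves the small-v regions
mass = trapezoid(f_VH(v), v)
mean = trapezoid(v * f_VH(v), v)

# Values obtained by integrating over sigma_e before integrating over v.
mass_expected = (1.0 - np.exp(-2.0)) * (1.0 / S1 - 1.0 / S2) / B1**2
mean_expected = (1.0 - np.exp(-4.0)) / (12.0 * np.pi * B1**2) * (S1**-3 - S2**-3)
print(mass, mass_expected, mean, mean_expected)
```

The continuous mass comes out close to $1 - e^{-2}$, scaled by the slight over-normalization of the fitted $\sigma_e$ density, with the remainder sitting in the point mass at $v = 0$.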

7.1.1 Adding Occlusion

It is easy to add occlusion into this distribution by considering an independent process whose probability of occluding any given point is $c$. Therefore, the above distribution is multiplied by another factor:

$$f_c(V_H = v) = \begin{cases} f(V_H = 0)(1 - c) + c & v = 0 \\ f(V_H = v)(1 - c) & \text{otherwise} \end{cases}$$

We will use the distribution $f$, not $f_c$, in the rest of the paper, and will reconsider the rate of occlusion only in the context of calculating false negatives in a later section.

7.2 Deriving $f(V_{\bar H})$

We do the same derivation for the distribution $f(V_{\bar H})$. Given a hypothesis and a random point, we calculate the distribution as follows: let event $A$ = “point falls in hypothesized error disk”. Its probability is the area of the error disk over the size of the image $R^2$, i.e.,

$$P(A \mid \sigma_e) = \frac{4\pi\sigma_e^2}{R^2}, \qquad P(\bar A \mid \sigma_e) = \frac{R^2 - 4\pi\sigma_e^2}{R^2}$$

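Now we calculate the probability that a point which is uniformly distributed inside a disk of radius $2\sigma_e$ contributes value $v$ for an incorrect hypothesis, using the evaluation function defined in the previous section. As before, we must express a uniform distribution in polar coordinates and then integrate over $\theta$ to get the distribution in terms of $r$ alone, since the evaluation function $g$ is a function of $r$:

$$f(r, \theta) = \frac{r}{\pi(2\sigma_e)^2}$$

$$f(r) = \int_0^{2\pi} \frac{r}{\pi(2\sigma_e)^2}\, d\theta = \frac{r}{2\sigma_e^2}$$

As before, we calculate the density of $(v = g(r) \mid A, \sigma_e)$ with the new distribution for $r$ and get:

$$f(V_{\bar H} = v \mid A, \sigma_e) = \frac{1}{2v}$$

Therefore, the density function of $v$ for a fixed $\sigma_e$ is:

$$f(V_{\bar H} = v \mid \sigma_e) = \begin{cases} P(\bar A \mid \sigma_e) = \frac{R^2 - 4\pi\sigma_e^2}{R^2} & v = 0 \\ f(v \mid A, \sigma_e)\, P(A \mid \sigma_e) = \frac{2\pi\sigma_e^2}{R^2 v} & \frac{1}{2\pi\sigma_e^2 e^2} \le v \le \frac{1}{2\pi\sigma_e^2} \\ 0 & \text{otherwise} \end{cases}$$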
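The conditional density $1/(2v)$ for a random point that falls inside the error disk can be checked by simulation. The sketch below is illustrative: `sigma_e = 3.0`, the seed, and the sample size are arbitrary, and the comparison uses the cumulative distribution $F(v) = 1 + \tfrac{1}{2}\ln(v / v_{\max})$ obtained by integrating $1/(2v)$ from the smallest attainable weight.

```python
import numpy as np

sigma_e = 3.0
disk_radius = 2.0 * sigma_e
rng = np.random.default_rng(1)

# Points uniform over the disk: the radius is disk_radius * sqrt(U) for uniform U.
r = disk_radius * np.sqrt(rng.random(200_000))

# Weight assigned by the Gaussian evaluation function at distance r.
v_max = 1.0 / (2.0 * np.pi * sigma_e**2)
v = v_max * np.exp(-(r**2) / (2.0 * sigma_e**2))

# Compare the empirical CDF of v with the CDF implied by the density 1 / (2 v).
grid = np.linspace(v_max * np.exp(-2.0), v_max, 50)
empirical = np.searchsorted(np.sort(v), grid, side="right") / v.size
predicted = 1.0 + 0.5 * np.log(grid / v_max)
max_abs_dev = float(np.max(np.abs(empirical - predicted)))
print(max_abs_dev)
```

The two curves agree to within sampling error, which supports the $1/(2v)$ form before it is weighted by the probability of landing in the disk.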

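Again, this expression correctly integrates to 1. As before, we need to integrate over all values of $\sigma_e$:

$$f(V_{\bar H} = v) = \int f(V_{\bar H} = v \mid \Sigma = \sigma)\, f_{\bar H}(\Sigma = \sigma)\, d\sigma$$

Dealing with $v = 0$ as a separate case, and with the same bounds as before, integrating yields:

$$f(V_{\bar H} = v) = \begin{cases} 1 - \frac{4\pi}{b_2 R^2}(s_2 - s_1) & v = 0 \\ \frac{2\pi}{b_2 R^2 v}\left(s_2 - \frac{1}{e\sqrt{2\pi v}}\right) & \ell_1 < v \le \ell_2 \\ \frac{2\pi}{b_2 R^2 v}\,\frac{e - 1}{e\sqrt{2\pi v}} & \ell_2 < v \le \ell_3 \\ \frac{2\pi}{b_2 R^2 v}\left(\frac{1}{\sqrt{2\pi v}} - s_1\right) & \ell_3 < v \le \ell_4 \\ 0 & \text{otherwise} \end{cases}$$

where $\ell_1, \ldots, \ell_4$ are the same breakpoints as in the distribution of $V_H$.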
We ran an experiment to test the analysis of this section, and the results are shown in Figure 2. Both graphs show a normalized histogram of the results of 15,000 independent trials. In the first graph the empirical results corroborate the predictions very closely. While the comparison in the second graph is less visually striking, note that the deviation at any point between the empirical and predicted results is generally less than one count.
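A check of this kind can be sketched in a few lines. The following is only an illustration, not the code used for Figure 2: it assumes the evaluation function is the Gaussian vote $g(r) = e^{-r^2/2\sigma_e^2} / (2\pi\sigma_e^2)$ that appears in the voting algorithm of Section 10, and it fixes $\sigma_e$ instead of integrating over it. Under that assumption, a point drawn uniformly from the disk of radius $2\sigma_e$ has vote density $1/(2v)$, so its mean vote is $(g(0) - g(2\sigma_e))/2$. All function names are hypothetical.

```python
import math
import random


def evaluation(r, sigma_e):
    """Assumed Gaussian vote for a point at distance r from a predicted location."""
    return math.exp(-r * r / (2.0 * sigma_e ** 2)) / (2.0 * math.pi * sigma_e ** 2)


def sample_vote_in_disk(sigma_e, rng):
    """Vote of a point drawn uniformly from the disk of radius 2 * sigma_e."""
    # For a uniform point in a disk, r = radius * sqrt(u) with u ~ U(0, 1).
    r = 2.0 * sigma_e * math.sqrt(rng.random())
    return evaluation(r, sigma_e)


def predicted_mean_vote_in_disk(sigma_e):
    """Mean of v under the density 1 / (2v) on [g(2 * sigma_e), g(0)]."""
    v_max = evaluation(0.0, sigma_e)
    v_min = evaluation(2.0 * sigma_e, sigma_e)
    return 0.5 * (v_max - v_min)


def empirical_mean_vote_in_disk(sigma_e, trials=15000, seed=0):
    """Monte Carlo estimate of the same mean from independent trials."""
    rng = random.Random(seed)
    total = sum(sample_vote_in_disk(sigma_e, rng) for _ in range(trials))
    return total / trials
```

Comparing `empirical_mean_vote_in_disk(2.5)` with `predicted_mean_vote_in_disk(2.5)` gives a quick sanity check of the conditional density before the integration over $\sigma_e$ is carried out.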
Figure 2: Distributions, f(v), with and without model


8  Finding  the  Weight  Density  of  a 
Model  in  an  Image 

Having found the single point densities, we use them to find the density of the combined weight of points for correct and incorrect hypotheses. We start with the density function on weights of correct hypotheses. For a model of size $m$ and an image of size $n$, a correct hypothesis should have weight density

$$f(W_H = v) = \left( \bigotimes_{i=1}^{m-3} f(V_H) \right) \otimes \left( \bigotimes_{i=1}^{n-m} f(V_{\bar{H}}) \right)$$

For an incorrect hypothesis we look at the problem in two steps. First we derive, as above, the mean and standard deviation of the process in which $n = m = 4$, i.e., a single random image point drops into a single error circle. From the distribution of $V_{\bar{H}}$ we find:


E-ffiv)  =  rvf{Vjf)dv+  f\f{VTf)dv 
Jo  Jli 

J  J i* 


assuming that each point contributes weight to its supporting basis independently of any other. In order to avoid convolving the distributions from the previous section, we find the expected value and the standard deviation of the distributions and invoke the central limit theorem to claim that the combined weight of a correct hypothesis of a size $m$ model in a size $n$ image should roughly follow the distribution:

$$N\big((m - 3) E_H + (n - m) E_{\bar{H}},\; (m - 3) \sigma_H^2 + (n - m) \sigma_{\bar{H}}^2\big)$$


in  which 


Eh(v)  =  f  vfc{v)dv+  f  vfe{v)dv 
Jo  Jli 

+  /  vfe{v)dv~h  f  vfc(v)dv 

Jli  Jli 

^  [i?  “  If] 


=  2.604  X  10"^  X 


(1-c)  ri 
6? 


=  f  v^fe{v)dv+  f  v^fc{v)dv 
Jo  Jli 

f  v^fe(v)dv+  f  v^fe{v)dv 

Jli  Jli 

jLi}..  [1  - 1] 

dOJT^tJc®  [sj  s|J 

(1-c)  [i  _  1' 

b]  [sf  s|. 


+ 


=  (1-^) 


=  1.6845  X  10"®  X 


$$\sigma_H^2 = E_H(v^2) - E_H(v)^2$$


=  .2882  X 


1 

btR-^ 


Eh{v^) 


/ 

Jo  Jli 

fti  fl 

+  /  v^fiVTr)dv  + 

Jh  Jli 


f(VTf)dv 


(e*  - 1)  .rj_  _ 

20e*R^b^7r  [s?  s| 

■  1  1 


$$\sigma_{\bar{H}}^2 = E_{\bar{H}}(v^2) - E_{\bar{H}}(v)^2$$


1.554  X  10"®  X 


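Plugging in the values $s_1 = 2.875$, $s_2 = 120$, $b_0 = 0.35$, $b_1 = 0.58$, $c = 0$, and $R = 500$ for the experimental data of section 6 yields

$$E_H = 3.26 \times 10^{-3}$$

$$\sigma_H^2 = 1.49 \times 10^{-5}$$

$$E_{\bar{H}} = 3.19 \times 10^{-6}$$

$$\sigma_{\bar{H}}^2 = 2.08 \times 10^{-8}$$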


Note that the value of the limit $s_2$ was determined empirically and is a function of the constraints on the bases that are chosen. Without the basis constraints, $s_2$ tends to infinity, and in fact the values of these parameters for $s_2 = 120$ and $s_2 = \infty$ are not significantly different.


Now, consider a single random image point (i.e., $n = 4$; three for the hypothesis and one left over) dropped into an image where a model of size $m > 4$ is hypothesized to be. In this case the probability that the random point will contribute weight $v$ to this hypothesis is calculated as follows: let event $A_i$ = “point drops in the $i$th circle.” Then,

$$f(V_{\bar{H}_m} = v \mid v \ne 0) = f(v, A_1) + f(v, A_2) + \ldots + f(v, A_{m-3}) = (m - 3)\, f(v, A_1)$$

Note that because we are assuming the circles are disjoint, we are overestimating the probability of the point falling in any circle. The actual rate of detection will be lower than our assumption, especially as $m$ grows large.

e,<v<t2 

=  j  _  l)V^]  f2<u<f3 

e,<v<e, 

0  otherwise 


r^* 

+  /  v-[(m  -  )]i 

Jt, 


As  m  grows  large,  (1  -  (m  —  3)^^[s2  -  si])  <  0  so 
this expression is no longer a density function. This is the point at which the model covers so much of the image that a random point will always contribute to some incorrect hypothesis. Therefore, this analysis only applies to models whose size $m$ stays below an upper bound that depends on the image size $R$: for $R = 500$, $m < 60$, and for $R = 256$, $m < 18$.

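The mean and standard deviation for one random point dropping into $m - 3$ random circles are:

$$E_{\bar{H}_m}(v) = \int v\, f(V_{\bar{H}_m} = v)\, dv = 0 + \int_{v > 0} v\, [(m - 3) f(V_{\bar{H}})]\, dv = (m - 3)\, E_{\bar{H}}(v)$$

$$E_{\bar{H}_m}(v^2) = \int v^2 f(V_{\bar{H}_m} = v)\, dv = 0 + \int_{v > 0} v^2 [(m - 3) f(V_{\bar{H}})]\, dv = (m - 3)\, E_{\bar{H}}(v^2)$$

$$\sigma_{\bar{H}_m}^2 = E_{\bar{H}_m}(v^2) - E_{\bar{H}_m}(v)^2 = (m - 3)\, E_{\bar{H}}(v^2) - (m - 3)^2 E_{\bar{H}}(v)^2$$

Dropping $n$ points convolves this distribution with itself $n - 3$ times:

$$f(W_{\bar{H}} = v) = \bigotimes_{i=1}^{n-3} f(V_{\bar{H}_m})$$

And therefore the weight that an $n$-size random image contributes to an incorrectly hypothesized model of size $m$ follows the distribution:

$$N\big((n - 3)\, E_{\bar{H}_m},\; (n - 3)\, \sigma_{\bar{H}_m}^2\big)$$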

Note  that  this  is  the  weight  density  of  a  single  incorrect 
hypothesis. 

The means and variances of both distributions were tested empirically in the same experiment as shown in Figure 2. A table of values is given in Figure 3.
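The normal approximation for an incorrect hypothesis can be evaluated directly from the single-point moments. The sketch below is a hypothetical helper, not the authors' code. It takes the mean and variance of one random point falling into one error circle, scales them to $m - 3$ circles, and then to $n - 3$ points. The example inputs 3.1940E-6 and 2.0760E-8 are the entries for $m - 3 = n - 3 = 1$ in Figure 3.

```python
def incorrect_hypothesis_normal(m, n, mean_single, var_single):
    """Mean and variance of the normal approximation to the weight of one
    incorrect hypothesis, for a model of size m in a random image of size n.

    mean_single, var_single: mean and variance of the vote of a single random
    point with a single error circle (the m = n = 4 case).
    """
    circles = m - 3                      # error circles other than the basis
    second_moment_single = var_single + mean_single ** 2
    mean_m = circles * mean_single       # one point, m - 3 disjoint circles
    var_m = circles * second_moment_single - mean_m ** 2
    points = n - 3                       # image points other than the basis
    return points * mean_m, points * var_m


# Example: m - 3 = 5 and n - 3 = 5, using the single-point values from Figure 3.
mean_w, var_w = incorrect_hypothesis_normal(8, 8, 3.1940e-6, 2.0760e-8)
```

With these inputs the helper returns values that agree with the Figure 3 entries 7.9850E-5 and 5.1797E-7 to the precision printed there.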

9  Interpreting  the  Results 

We have derived expressions for the weight densities of a hypothesis given that it is incorrect, and given that it is correct. We are interested in using these distributions to determine the effectiveness of geometric hashing under different clutter conditions. To do this, we briefly introduce the ROC (receiver operating characteristic) curve, a concept borrowed from standard hypothesis testing theory, and cast our problem in terms of this framework.

9.1  ROC:  Introduction 

The problem is to decide which one of two hypotheses, $H_0$ and $H_1$, is correct. There is a random variable $X$ whose distribution is known given one or the other hypothesis, i.e., we know $f(X \mid H_0)$ and $f(X \mid H_1)$. Let the space of all possible values of the random variable $X$ be divided into two regions, $Z_0$ and $Z_1$, such that we decide $H_0$ if the value of $X$ falls in $Z_0$ and $H_1$ if $X$ falls in $Z_1$. Then we can define the quantities

$$\Pr(\text{say } H_0 \mid H_0 \text{ is true}) = \int_{Z_0} p(X \mid H_0)\, dX$$

$$P_F = \Pr(\text{say } H_1 \mid H_0 \text{ is true}) = \int_{Z_1} p(X \mid H_0)\, dX$$

$$P_M = \Pr(\text{say } H_0 \mid H_1 \text{ is true}) = \int_{Z_0} p(X \mid H_1)\, dX$$

$$P_D = \Pr(\text{say } H_1 \mid H_1 \text{ is true}) = \int_{Z_1} p(X \mid H_1)\, dX$$

Distribution f(W_H = v):

m-3   n-3   Mean                      Variance
            Empirical   Predicted     Empirical   Predicted
  1     1   3.6953E-3   3.2177E-3     1.5186E-5   1.4625E-5
  1   100   3.8383E-3   3.5339E-3     1.7350E-5   1.6680E-5
  1   500   4.8026E-3   4.8115E-3     2.2274E-5   2.4984E-5
  5     5   1.9658E-2   1.6089E-2     1.4927E-4   7.3124E-5
 10    10   4.1986E-2   3.2177E-2     5.4130E-4   1.4625E-4
 10   100   4.4513E-2   3.5052E-2     5.3400E-4   1.6485E-4
 10   500   5.5476E-2   4.7828E-2     5.7484E-4   2.4752E-4

Distribution f(W_H̄ = v):

m-3   n-3   Mean                      Variance
            Empirical   Predicted     Empirical   Predicted
  1     1   3.2410E-6   3.1940E-6     1.8747E-8   2.0760E-8
  1   100   3.0681E-4   3.1940E-4     1.9738E-6   2.0760E-6
  1   500   1.6344E-3   1.5970E-3     1.1163E-5   1.0380E-5
  5     5   8.9131E-5   7.9850E-5     6.4808E-7   5.1797E-7
 10    10   3.4949E-4   3.1940E-4     2.4001E-6   2.0668E-6
 10   100   3.5082E-3   3.1940E-3     2.3277E-5   2.0668E-5
 10   500   1.6289E-2   1.5970E-2     1.0766E-4   1.0334E-4

Figure 3: A table of predicted versus empirical means and variances of the distribution f(W_H = v), in the top table, and f(W_H̄ = v) in the bottom table, for different values of m and n.

These quantities are often referred to as $P_M$ = “Probability of a miss”, $P_D$ = “Probability of detection”, and $P_F$ = “Probability of false alarm” for historical reasons. One way of constructing a decision rule is to use the likelihood ratio test (LRT) to divide the observation space into decision regions, i.e.,

$$\frac{p(X \mid H_1)}{p(X \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

That is, if the ratio of the conditional densities is greater than a fixed threshold $\eta$, choose $H_1$, otherwise choose $H_0$. Note that changing the value of $\eta$ changes the decision regions and thus the values of $P_F$ and $P_D$. The ROC curve is simply the graph of $P_D$ versus $P_F$ as a function of the threshold for the LRT. As it turns out, both the Neyman-Pearson test and the optimal Bayes test involve this LRT, thus the ROC curve encapsulates all information needed for either test, since any $(P_F, P_D)$ point yielded by either test necessarily lies on the ROC curve. If the prior probabilities of $H_0$ and $H_1$ are known, then the optimal Bayes decision rule picks the ROC point which minimizes the expected cost of the decision by using the LRT in which the threshold is a function of the costs and priors involved:

$$\eta = \frac{(C_{10} - C_{00})\, P_0}{(C_{01} - C_{11})\, P_1}$$

where $C_{ij}$ is the cost associated with choosing hypothesis $i$ given that hypothesis $j$ is correct. In the absence of such priors, a Neyman-Pearson test is often considered optimal, in which one simply picks a point on the ROC curve which gives satisfactory performance. Note that this is not the same as minimizing the decision’s expected cost.
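As a small illustration of the Bayes threshold, the following sketch computes $\eta$ from costs and priors and applies the likelihood ratio test to a pair of likelihood values. The function and argument names are hypothetical and chosen only for this example.

```python
def bayes_threshold(c00, c01, c10, c11, p0, p1):
    """LRT threshold eta for costs c_ij (choose i when j is true) and priors p0, p1."""
    return ((c10 - c00) * p0) / ((c01 - c11) * p1)


def decide(likelihood_h1, likelihood_h0, eta):
    """Return 1 to choose H1 and 0 to choose H0 under the likelihood ratio test."""
    # Written without a division so that likelihood_h0 == 0 is handled safely.
    return 1 if likelihood_h1 > eta * likelihood_h0 else 0
```

With zero cost for correct decisions, unit cost for errors, and equal priors, `bayes_threshold(0, 1, 1, 0, 0.5, 0.5)` is 1, so the test simply picks the hypothesis with the larger likelihood. Raising the prior of $H_0$ raises $\eta$ and makes the test more reluctant to declare $H_1$.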

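For example, assume for our problem that $H_0 \sim N(m_0, \sigma_0^2)$ and $H_1 \sim N(m_1, \sigma_1^2)$, and assume that $m_1 > m_0$ and $\sigma_1 > \sigma_0$. The likelihood ratio test yields:

$$\frac{\sigma_0}{\sigma_1} \exp\left( \frac{(X - m_0)^2}{2\sigma_0^2} - \frac{(X - m_1)^2}{2\sigma_1^2} \right) \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

The regions $Z_0$ and $Z_1$ are found by solving the above equation for equality,

$$x_{1,2} = \frac{(m_0 \sigma_1^2 - m_1 \sigma_0^2) \mp \sigma_0 \sigma_1 \sqrt{(m_0 - m_1)^2 + 2(\sigma_1^2 - \sigma_0^2) \ln(\eta \sigma_1 / \sigma_0)}}{\sigma_1^2 - \sigma_0^2}$$

so that $Z_0$ is the interval between $x_1$ and $x_2$ and $Z_1$ is its complement. The values of $P_F$ and $P_D$ are found by integrating the conditional probability densities $p(X \mid H_0)$ and $p(X \mid H_1)$ over these regions $Z_0$ and $Z_1$:

$$P_F = \int_{Z_1} p(X \mid H_0)\, dX = 1 - \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}\, \sigma_0}\, e^{-(X - m_0)^2 / 2\sigma_0^2}\, dX$$

$$P_D = \int_{Z_1} p(X \mid H_1)\, dX = 1 - \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}\, \sigma_1}\, e^{-(X - m_1)^2 / 2\sigma_1^2}\, dX$$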

In Figure 4, for example, we have plotted the ROC curve for the distributions $f(X \mid H_0)$ and $f(X \mid H_1)$ shown alongside. The axes are $x = P_F$, $y = P_D$. The line $x = y$ is a lower bound, since points on this line indicate that any decision is as likely to be true as false, so the observed value of $X$ gives us no information. Though an ROC curve is a 3D entity (i.e., a curve of points in $(P_F, P_D, \eta)$ space), we display its projection onto the $\eta = 0$ plane and can easily find the associated $\eta$ value for any $(P_F, P_D)$ pair. When the threshold is high there is a 0 probability of false positive, but a 0 probability of correct identification as well. As the threshold goes down, the probabilities of both occurrences go up until the threshold is so low that both correct and false identification are certain. In our problem we assume that we do not have priors, so our goal is to pick a threshold such that we have a very high probability of identification and a low probability of false positives, i.e., we are interested in picking a point as close to the upper left hand corner as possible. Note that the larger the separation between the two hypothesis distributions, the more the curve is pushed towards that direction.

Figure 4: On the left are displayed the conditional probability density functions of a random variable X. On the right is the associated ROC curve, where P_F and P_D correspond to the x and y axes, respectively.
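For two Gaussian hypotheses with $\sigma_1 > \sigma_0$, the likelihood ratio test reduces to a quadratic inequality in $X$, so each threshold $\eta$ gives a closed-form $(P_F, P_D)$ point. The sketch below traces such a curve using only the standard library. The function names and the example parameters are hypothetical, and the code covers only the case $\sigma_1 > \sigma_0$ assumed in the text.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def lrt_roc_point(eta, m0, s0, m1, s1):
    """(P_F, P_D) of the likelihood ratio test with threshold eta > 0.

    H0 ~ N(m0, s0^2), H1 ~ N(m1, s1^2), and s1 > s0 is required.
    """
    a = s1 ** 2 - s0 ** 2
    disc = (m0 - m1) ** 2 + 2.0 * a * math.log(eta * s1 / s0)
    if disc <= 0.0:
        # The ratio exceeds eta everywhere: H1 is always chosen.
        return 1.0, 1.0
    centre = (m0 * s1 ** 2 - m1 * s0 ** 2) / a
    half_width = s0 * s1 * math.sqrt(disc) / a
    x1, x2 = centre - half_width, centre + half_width
    # H0 is chosen between x1 and x2, H1 outside that interval.
    p_f = 1.0 - (normal_cdf(x2, m0, s0) - normal_cdf(x1, m0, s0))
    p_d = 1.0 - (normal_cdf(x2, m1, s1) - normal_cdf(x1, m1, s1))
    return p_f, p_d


def roc_curve(m0, s0, m1, s1, etas):
    """List of (P_F, P_D) points, one per threshold in etas."""
    return [lrt_roc_point(eta, m0, s0, m1, s1) for eta in etas]
```

Sweeping `etas` over several orders of magnitude, for example `[10 ** (k / 4) for k in range(-12, 13)]`, moves the operating point from the upper right corner (always say $H_1$) towards the lower left corner (almost never say $H_1$).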

9.2  Applying  ROC  to  Geometric  Hashing 

In our problem formulation, $H_0$ is the hypothesis that the model is not in the image, and $H_1$ is the hypothesis that it is. In our case, we have a different ROC curve associated with every fixed $(m, n)$ pair, where $m$ and $n$ are the number of model and image features, respectively.

The next examples show the predicted comparison of the Gaussian model to the bounded uniform model. Figure 5 shows the ROC curves for the Gaussian and uniform models, with $m - 3 = 10$, $n - 3 = 10, 50, 100, 500, 1000$, and occlusion = 0.0 and 0.25. We can see that in the case of no occlusion, for small values of $n$, both models predict good $P_F$ vs. $P_D$ curves, though the bounded uniform model will always be better because there is no possibility of a false negative for occlusion = 0, while in the unbounded Gaussian case there always is. However, as $n$ increases, the uniform model breaks down more rapidly than the Gaussian model for both occlusion values. For occlusion = 0.25, both models perform about equally for small values of $n$ (for example, at $n = 100$), but again as $n$ increases, the uniform error model fails more dramatically than the Gaussian model ($n > 500$).

Using  this  technique,  we  can  predict  thresholds  for 
actual  experiments,  as  shown  in  the  next  section. 

10  Experiment 

The predictions of the previous section were tested in the following experiment: to test an ROC curve for model size $m$ and image size $n$, we run two sets of trials, one to test the probability of detection and one to test the probability of false alarm. For the former, a random model of size $m$ consisting of point features was generated and projected into an image, with Gaussian noise ($\sigma = 2.5$) added to both the $x$ and $y$ positional components of each point feature. Occlusion ($c$) is simulated by giving each point a probability $c$ of not appearing in the resulting image. Only correct correspondences are tested, and the weight of each of these correct hypotheses is found using the algorithm:

(a) For a correct hypothesis $(m_0 : i_0;\ m_1 : i_1;\ m_2 : i_2)$, for every other model point $m_j$:

(i) Find coordinates $m_j = (\alpha_j, \beta_j)$ with respect to basis $(m_0, m_1, m_2)$, and from this, $\sigma_e$.

(ii) For every image point $i_j$, find the minimum distance $d$ between $i_j$ and any of the projected points such that $d \le 2\sigma_e$. Add $v = \frac{1}{2\pi\sigma_e^2} e^{-d^2 / 2\sigma_e^2}$ to the supporting weight for this hypothesis.

(b) If the weight of the vote for this hypothesis is greater than some threshold $\theta$, stop and output this as a correct instance of the model.
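The weight accumulation in steps (a) and (b) can be sketched as follows. This is an illustrative reading of the algorithm, not the authors' implementation: the projected model points and their per-point $\sigma_e$ values are assumed to have been computed already from the basis, and all names are hypothetical.

```python
import math


def hypothesis_weight(projected_points, sigmas, image_points):
    """Accumulate the Gaussian-weighted vote for one hypothesis.

    projected_points: (x, y) predicted image locations of the non-basis model points
    sigmas: sigma_e for each projected point, in the same order
    image_points: (x, y) locations of the non-basis image points
    """
    weight = 0.0
    for ix, iy in image_points:
        best = None  # (distance, sigma_e) of the nearest projected point within 2 * sigma_e
        for (px, py), sigma_e in zip(projected_points, sigmas):
            d = math.hypot(ix - px, iy - py)
            if d <= 2.0 * sigma_e and (best is None or d < best[0]):
                best = (d, sigma_e)
        if best is not None:
            d, sigma_e = best
            weight += math.exp(-d * d / (2.0 * sigma_e ** 2)) / (2.0 * math.pi * sigma_e ** 2)
    return weight


def accept(weight, threshold):
    """Step (b): report the hypothesis when its weight exceeds the threshold."""
    return weight > threshold
```

An image point that lies outside every error disk adds nothing, so clutter far from the projected model does not change the weight of the hypothesis.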

For  our  experiment,  we  loop  through  thresholds  from 
0  to  Eaiy),  and  for  every  threshold  we  run  the  above 
algorithm enough times to get 100 sample points. To test the probability of false alarm, we run exactly the same experiment, except that we use random images which do not contain the model we are looking for. We loop through the same thresholds as in the previous case to get one $(P_F, P_D)$ pair for each threshold. The resulting $P_F$, $P_D$, and ROC curves are shown in Figure 6 for $n - 3 = 10, 100, 500, 500$ and occlusion $c = 0.0, 0.0, 0.0, 0.25$. The ROC curves for the same parameters are shown alongside.
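The threshold sweep itself is simple to express. The sketch below assumes the hypothesis weights from the two sets of trials have already been collected into two lists. It is a hypothetical helper for illustration, and it reuses one set of samples for all thresholds instead of drawing fresh trials per threshold as described above.

```python
def empirical_roc(weights_with_model, weights_without_model, thresholds):
    """Return (threshold, P_F, P_D) triples estimated from sampled weights.

    weights_with_model: weights of correct hypotheses (model present)
    weights_without_model: weights of hypotheses in random images (model absent)
    """
    points = []
    for theta in thresholds:
        # Fraction of trials whose weight exceeds the acceptance threshold.
        p_d = sum(w > theta for w in weights_with_model) / len(weights_with_model)
        p_f = sum(w > theta for w in weights_without_model) / len(weights_without_model)
        points.append((theta, p_f, p_d))
    return points
```

Plotting the $P_D$ values against the $P_F$ values gives the empirical ROC curve, and reading off the threshold stored with each point recovers the acceptance threshold for a chosen operating point.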

In the cases of no occlusion, the predicted and empirical curves match very nicely. However, for occlusion = 0.25, the empirical ROC curve falls below our expectations. This is due to the fact that the distribution of $W_H$ has a larger variance than our predicted value (see the table in Figure 3, and Figure 7). In fact, though we assumed at the outset of the analysis that the individual random variables $V_H$ were independent, this is not the case; for a correct basis matching, the joint distribution of any two error vectors $e_i$, $e_j$, $i, j \ne 0, 1, 2$ can be shown to have a non-zero covariance:

$$\Lambda_{ij} = (1 - \alpha_i - \beta_i)(1 - \alpha_j - \beta_j) + \alpha_i \alpha_j + \beta_i \beta_j$$

This leads to a larger variance for the overall distribution than that predicted using the independence assumption and hence poorer results. We are currently working on another analysis that takes this dependence into account.

Figure 6: Comparison of predicted to empirical curves for probability of false alarm, probability of detection, and ROC curves. From top to bottom, n - 3 = 10, 100, 500, 500, occlusion = 0, 0, 0, 0.25.

11  Conclusion 

The geometric hashing method was introduced by Lamdan, Schwartz and Wolfson in 1987. The first error analysis of the geometric hashing technique was done by Grimson, Huttenlocher and Jacobs, who showed that with even very small amounts of noise and spurious features, the technique had a very high probability of false positives. However, they assumed that the error was uniform and bounded, which is a worst-case scenario and places an upper bound on the error rate. As we have shown here, with a Gaussian error assumption we can do much better.

Costa, Haralick, and Shapiro demonstrated another error analysis for geometric hashing [CHS90], also based on a 2D Gaussian noise distribution associated with each point. Their analysis differs from this one technically in many respects, but the main difference is that they assume that the model they are looking for is present in the image and they focus on finding the pose by deriving an optimal voting scheme. This is in contrast to the work presented here, in which, given a voting scheme and no prior information about the presence or absence of the model, we explicitly derived the probability of false detection as a function of clutter, and characterized the confidence level of the hypotheses that the method offers as “correct”. We did this by choosing a hypothesis evaluation function, and deriving the probability density of the evaluation function on both correct and incorrect hypotheses to determine, when given some hypothesis, which distribution it was drawn from. We showed also that the Gaussian error model separates the two distributions more than the uniform bounded error model, leading to better ROC curves.

The contribution of this work is to cast the geometric hashing technique in terms of standard estimation theory, which has several advantages. The ROC curve formulation explicitly demonstrates the performance achievable for a given signal to noise ratio as a function of acceptance threshold. Given a desired detection rate, the user can determine from the ROC curve what acceptance threshold to use in order to minimize the probability of false detections. In this formulation it is also clear when adequate performance cannot be achieved, for if the desired minimum performance point $(P_F, P_D)$ lies above the ROC curve for a particular clutter level, then this performance is not possible no matter what operating parameters are chosen. The ROC formulation is also a succinct method for comparing voting schemes, as we compared the voting schemes implied by the Gaussian versus uniform error models. We expect to be able to use such techniques to choose thresholds analytically instead of heuristically in recognition systems.